Learning Syntactic Patterns for Automatic Hypernym Discovery
Abstract
Semantic taxonomies such as WordNet provide a rich source of knowledge for natural language processing applications, but are expensive to build, maintain, and extend. Motivated by the problem of automatically constructing and extending such taxonomies, in this paper we present a new algorithm for automatically learning hypernym (is-a) relations from text. Our method generalizes earlier work that had relied on using small numbers of hand-crafted regular expression patterns to identify hypernym pairs. Using “dependency path” features extracted from parse trees, we introduce a general-purpose formalization and generalization of these patterns. Given a training set of text containing known hypernym pairs, our algorithm automatically extracts useful dependency paths and applies them to new corpora to identify novel pairs. On our evaluation task (determining whether two nouns in a news article participate in a hypernym relationship), our automatically extracted database of hypernyms attains both higher precision and higher recall than WordNet.
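As a rough illustration of the "dependency path" idea described in the abstract, the sketch below extracts the shortest path between two nouns in a dependency parse and counts how often each path occurs across known hypernym pairs. It is not the authors' implementation: the parses are hand-written toy data, the relation labels (`prep`, `pobj`, `mwe`) and the path notation are assumptions made for this example, and the function names `dependency_path` and `count_paths` are hypothetical.

```python
from collections import Counter, deque


def dependency_path(tokens, edges, src, dst):
    """Return the shortest dependency path from token src to token dst.

    tokens: list of words; edges: list of (head_idx, dependent_idx, label).
    The two endpoints are replaced by the placeholders X and Y.
    """
    # Build an undirected adjacency list that remembers edge direction.
    adj = {i: [] for i in range(len(tokens))}
    for head, dep, label in edges:
        adj[head].append((dep, f"-{label}->"))   # walking head to dependent
        adj[dep].append((head, f"<-{label}-"))   # walking dependent to head

    # Breadth-first search yields a shortest path in number of edges.
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        node, path = queue.popleft()
        if node == dst:
            parts = ["X"]
            for arrow, idx in path:
                parts.append(arrow)
                parts.append("Y" if idx == dst else tokens[idx])
            return " ".join(parts)
        for nxt, arrow in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(arrow, nxt)]))
    return None  # the two tokens are not connected


def count_paths(examples):
    """Count dependency paths over (tokens, edges, hyponym_idx, hypernym_idx)."""
    counts = Counter()
    for tokens, edges, src, dst in examples:
        path = dependency_path(tokens, edges, src, dst)
        if path is not None:
            counts[path] += 1
    return counts


# Toy, hand-written parses of "animals such as dogs" and "fruits such as apples".
# The attachment choices and labels are assumptions, not real parser output.
toy_edges = [(0, 2, "prep"), (2, 3, "pobj"), (2, 1, "mwe")]
sent1 = (["animals", "such", "as", "dogs"], toy_edges, 3, 0)
sent2 = (["fruits", "such", "as", "apples"], toy_edges, 3, 0)

counts = count_paths([sent1, sent2])
print(counts.most_common())
```

Both toy sentences yield the same path, so that path would stand out as a frequent feature among known hypernym pairs. In a learned system, such path counts would feed a classifier trained on labeled noun pairs rather than being used directly.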
Similar resources
Automatic Extraction of Turkish Hypernym-Hyponym Pairs From Large Corpus
In this paper, we propose a fully automatic system for the acquisition of hypernym/hyponym relations from a large Turkish corpus. The method relies on both lexico-syntactic patterns and semantic similarity. Once the model has extracted the seeds by using patterns, it applies similarity-based expansion in order to increase recall. For the expansion, several scoring functions within a boots...
Hypernym-Hyponym Acquisition-A Juxtapose Study
Relationship acquisition has always contributed as an important enhancement to Natural Language Processing. Using “definition extraction” methods proved to be a very useful approach, not only for relationship extraction but also for enhancing the limited coverage of words in WordNet. It also played a major role in developing question answer systems and ontology learning for unrestricted data. V...
CS 224N Class Project Automatic Hypernym Classification
Hypernym classification is the task of deciding whether, given two words, one word “is a kind of” the other. We present a classifier that learns the noun hypernym relation based on automatically-discovered lexico-syntactic patterns between a set of provided hyponym/hypernym noun pairs. This classifier is shown to outperform two previous methods for automatically identifying hypernym pairs (usin...
Wikipedia as the Premiere Source for Targeted Hypernym Discovery
Targeted Hypernym Discovery (THD) applies lexico-syntactic (Hearst) patterns on a suitable corpus with the intent to extract one hypernym at a time. Using Wikipedia as the corpus in THD has recently yielded promising results in a number of tasks. We investigate the reasons that make Wikipedia articles such an easy target for lexicosyntactic patterns, and suggest that it is primarily the adheren...
Supervised Learning of Syntactic Contexts for Uncovering Definitions and Extracting Hypernym Relations in Text Databases
In this paper we address the problem of automatically constructing structured knowledge from plain texts. In particular, we present a supervised learning technique to first identify definitions in text data, while then finding hypernym relations within them making use of extracted syntactic structures. Instead of using pattern matching methods that rely on lexico-syntactic patterns, we propose ...